release-21.2: flowinfra: make max_running_flows default depend on the number of CPUs #75509
Backport 1/1 commits from #71787.
/cc @cockroachdb/release
We think that it makes sense to scale the default value for
max_running_flows based on how beefy the machines are, so we make it a
multiple of the number of available CPU cores. We do so in a
backwards-compatible fashion by treating positive values of
sql.distsql.max_running_flows as absolute values (the previous meaning)
and negative values as multiples of the number of CPUs.
The choice of 128 as the default multiple is driven by the old default
value of 500 and is such that if we have 4 CPUs, then we'll get the value
of 512, pretty close to the old default.
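
The interpretation described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the helper name effectiveMaxRunningFlows is hypothetical, and the use of runtime.NumCPU() to obtain the core count is an assumption (the actual change may determine the CPU count differently).

```go
package main

import (
	"fmt"
	"runtime"
)

// effectiveMaxRunningFlows is a hypothetical helper (not taken from the PR)
// that resolves the value of sql.distsql.max_running_flows into an absolute
// limit. Non-negative values are used as-is (the previous meaning); negative
// values are treated as a per-CPU multiple.
func effectiveMaxRunningFlows(settingValue int64, numCPUs int) int64 {
	if settingValue < 0 {
		return -settingValue * int64(numCPUs)
	}
	return settingValue
}

func main() {
	// The old default is an absolute value and is unaffected by CPU count.
	fmt.Println(effectiveMaxRunningFlows(500, 4)) // 500
	// The new default of -128 scales with the number of CPUs.
	fmt.Println(effectiveMaxRunningFlows(-128, 4)) // 512
	fmt.Println(effectiveMaxRunningFlows(-128, 8)) // 1024
	// Assumption: runtime.NumCPU() stands in for "available CPU cores".
	fmt.Println(effectiveMaxRunningFlows(-128, runtime.NumCPU()))
}
```

Keeping positive values absolute means that clusters which already override the setting keep their current limit after the upgrade.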
Informs: #34229.
Release note (ops change): The meaning of the
sql.distsql.max_running_flows cluster setting has been extended so that
when the value is negative, its absolute value is multiplied by the number
of CPUs on the node to get the maximum number of concurrent remote flows on
the node. The default value is -128, meaning that on a 4 CPU machine we
will have up to 512 concurrent remote DistSQL flows, but on an 8 CPU
machine up to 1024. The previous default was 500.
Release justification: low-risk change to reduce the number of "no inbound
stream connection" errors.